DDC FILE COPY    AD A066698


STRUCTURE  PRESERVING  TRANSFORMATIONS  IN  THE  COMPARISON 
OF  COMPLEX  STEADY-STATE  SOUNDS 


James  H.  Howard,  Jr.,  and  Donald  C.  Burgy 
ONR CONTRACT NUMBER N00014-75-C-0308


Technical  Report  ONR-78-6 

Human Performance Laboratory
Department of Psychology
The Catholic University of America

December,  1978 


Approved for public release; distribution unlimited.

Reproduction in whole or in part is permitted for any
purpose of the United States Government.


REPORT DOCUMENTATION PAGE

1. REPORT NUMBER: ONR-78-6

4. TITLE: Structure Preserving Transformations in the Comparison of Complex, Steady-State Sounds

5. TYPE OF REPORT: Technical Report

7. AUTHORS: James H. Howard, Jr., and Donald C. Burgy

8. CONTRACT OR GRANT NUMBER: N00014-75-C-0308

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
The Catholic University of America
Washington, D.C. 20064

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: NR 197-027

11. CONTROLLING OFFICE NAME AND ADDRESS:
Engineering Psychology Programs, Code 455
Office of Naval Research

15. SECURITY CLASS. (of this report): Unclassified

16. DISTRIBUTION STATEMENT (of this Report):
Approved for public release; distribution unlimited

19. KEY WORDS:
auditory perception
auditory pattern recognition
feature extraction
feature selection


20. ABSTRACT

A process-oriented feature selection model was proposed to
characterize listeners' comparisons of complex sounds.
Specifically, the model assumes that the listener performs a structural
analysis on the low-resolution spectra of the stimuli to be compared and then
extracts a feature representation through a structure-preserving transformation
resembling a principal-components analysis. This feature representation is
subsequently employed to make similarity judgments between stimuli. Predictions
of the model for a timbre-comparison task were examined using a set of sixteen




complex sounds that varied in amplitude-spectral shape. The subjective
feature representation obtained from the ALSCAL nonmetric scaling program
was generally consistent with the theoretical feature representation produced
by the optimal structure-preserving transformation applied to the
loudness-weighted spectra. The two comparison features as well as the relative
importance of the two dimensions were successfully predicted by the model.
Practical implications for the subjective evaluation of complex signals are
discussed and refinements to the transformations in the model are suggested for
further research.


A variety  of  recent  auditory  psychophysical  studies  have 
required  listeners  to  evaluate  the  subjective  similarity  of  two 
or more complex acoustic stimuli. Such studies have involved
both speech (Shepard, 1972) and complex nonspeech sounds (Miller
& Carterette, 1975; Howard, 1977; Grey & Gordon, 1978). It is
generally assumed that the similarity ratings obtained in this
situation reflect the outcome of a perceptual comparison based on
one  or  more  psychophysical  features  that  characterize  members  of 
the  stimulus  set.  Typically,  standard  metric  and  nonmetric 
multidimensional  scaling  techniques  are  used  to  extract  a set  of 
perceptual  dimensions  from  the  observed  matrix  of  similarity 
judgments.  The  dimensions  revealed  in  this  analysis  are  thought 
to  reflect  the  elementary  perceptual  units  or  features  that  the 
listeners  used  to  compare  the  stimuli.  An  important  implicit 
assumption  in  this  research  is  that  human  listeners  can  reliably 
report  the  perceived  similarity  among  sounds  even  though  they 
may  not  be  explicitly  aware  of  the  underlying  stimulus  features. 
As  Plomp  and  his  associates  have  indicated  (Plomp,  1976),  these 
methods  have  contributed  significantly  to  our  understanding  of 
the  processes  involved  in  timbre  perception. 

The Feature Selection Problem

The  specific  question  addressed  in  the  present  paper 
concerns  the  perceptual  features  listeners  use  to  make  pairwise 
similarity  judgments  on  a set  of  sixteen  complex  steady-state 
sounds  that  differ  primarily  in  timbre.  How  are  the  elementary 
units  of  comparison  determined?  What  criteria  do  listeners  use 
to  select  a subset  of  all  possible  dimensions  for  comparing  the 




Feature  Selection 


Page  2 


individual  members  of  the  stimulus  set? 

Howard and Ballas (1978) have referred to this as the
feature selection problem. Two contrasting approaches to this
problem have been suggested in the literature. First, the human
auditory system may be equipped with a set of specific feature
detecting mechanisms that monitor incoming aural information for
particular stimulus cues (e.g., Barlow, 1972). This approach
emphasizes the importance of the feature detectors themselves.
Each detector "looks for" an individual stimulus property, and a
set of feature detectors determines a property list for the
stimulus. Howard and Ballas (1978) referred to this as the
property-list approach. Second, it is possible that the
auditory system has an internalized set of rules and criteria
for feature selection rather than a set of finely-tuned feature
detectors. These rules and processes enable the listener to
determine what the comparison features should be in any
particular stimulus context. This view was called the
process-oriented approach (Howard & Ballas, 1978).

Although evidence supporting both positions can be found in
the literature, Howard and Ballas (1978) argue that the
process-oriented approach is more naturally suited for
theorizing about the timbre comparison task. While it may be
reasonable to argue that man has evolved specialized brain
"filters" for certain aural cues (e.g., speech features), an
extension of this argument to include detectors for the
individual timbre attributes of complex tones resists
credibility. Since timbre obviously encompasses a large set of
perceptual attributes (Plomp, 1976; von Bismarck, 1974), at the
very least, a property-list approach to timbre comparison would
suffer an embarrassing lack of parsimony. Furthermore, if we
were  to  argue  that  only  a subset  of  detectors  would  be  used  in 
any  particular  comparison  task  then  we  would  still  be  obliged  to 
explain  how  that  subset  is  selected  by  the  listener. 
Consequently,  in  the  present  paper  we  adopt  a process-oriented 
approach  to  the  feature  selection  problem.  In  other  words, 
rather  than  searching  for  the  set  of  invariant  auditory  feature 
detectors  that  underlie  timbre  comparison,  we  will  attempt  to 
outline  some  general  principles  that  would  account  for  feature 
selection  in  a variety  of  comparison  contexts. 

Toward a Model of Feature Selection

In their recent treatment of this problem, Howard and
Ballas (1978) argue that when asked to compare the timbre of
steady-state  sounds,  human  listeners  perform  a structural 
analysis  on  the  low-resolution  spectra  of  the  comparison 
stimuli.  In  this  case,  the  feature  selection  process  may  be 
thought  of  as  a structure  preserving  transformation  that  maps 
stimuli  from  an  initial  low-level  representation  (the 
measurement  representation)  onto  a higher-order  representation 
of  lower  dimensionality  (a  feature  representation).  In  the  case 
of  steady-state  complex  sounds  it  may  be  argued  that  the 
measurement  representation  is  approximated  by  a 1/3-octave 
spectral  analysis,  adjusted  for  unequal  sensitivity  across  the 
spectrum  (Zwicker,  Flottorp  & Stevens,  1957).  Although  in 
general it is evident that information will be lost with such a
transformation, Howard and Ballas (1978) argue that listeners
select  the  comparison  features  so  as  to  minimize  this  loss.  In 
other  words,  in  a comparison  task,  features  are  selected  to 
account  for  as  much  of  the  variability  among  the  measurement 
representations  of  the  stimuli  as  possible.  They  point  out  that 
a transformation  having  these  properties  is  very  similar  to  a 
principal  components  analysis. 

A principal  components  analysis  provides  a transformation 
that  maps  objects  from  one  space  into  a subspace  of  lower 
dimensionality.  The  first  principal  component  is  simply  a new 
axis  in  the  original  space  that  accounts  for  most  of  the 
variability  among  the  objects.  In  other  words,  the  set  of 
projections  of  objects  in  the  measurement  space  onto  the  first 
principal  component  has  maximum  variance.  The  second  principal 
component  is  an  axis  orthogonal  to  the  first  that  accounts  for 
most of the residual variance and so on (Harris, 1975).
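The maximum-variance property described above can be checked numerically. The following Python sketch uses NumPy with a small set of made-up two-dimensional measurements; the data, the names `X` and `projection_variance`, and the brute-force angle scan are illustrative assumptions and are not taken from the report. It compares the best axis found by scanning all directions with the axis returned by an eigen-analysis of the covariance matrix.

```python
import numpy as np

# Hypothetical 2-D measurements for five objects (rows); not data from the report.
X = np.array([[2.0, 1.9], [0.5, 0.7], [-1.0, -1.2], [-2.0, -1.8], [0.5, 0.4]])
Xc = X - X.mean(axis=0)          # center each measurement


def projection_variance(angle):
    """Variance of the objects' projections onto a unit axis at `angle` radians."""
    axis = np.array([np.cos(angle), np.sin(angle)])
    return np.var(Xc @ axis, ddof=1)


# Brute-force scan over candidate axes in the original space.
angles = np.linspace(0.0, np.pi, 1801)
variances = np.array([projection_variance(a) for a in angles])
best = angles[np.argmax(variances)]

# Eigen-analysis of the covariance matrix gives the same axis directly.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
first_pc = eigvecs[:, -1]        # eigh returns eigenvalues in ascending order
```

In this sketch the largest projection variance found by the scan should agree, up to the resolution of the scan, with the largest eigenvalue, and the scanned axis should coincide with `first_pc` up to sign.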

Given  these  arguments,  we  can  construct  a preliminary  model 
of  the  stimulus  comparison  process  for  steady-state  complex 
sounds.  Figure  1 displays  an  outline  of  our  approach. 


Insert  Figure  1 here 

An initial measurement transformation, denoted $M$, determines
a measurement representation from the time-domain stimuli. We
assume that $M$ reflects a low-resolution spectral analysis,
and denote the measurement representation for stimulus $S_i$ by a
column vector of $m$ 1/3-octave band levels, $x_i = M(S_i)$, where
$x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$. After this, a second transformation,
$T$, occurs that extracts a set of comparison features from the
measurement vectors. In our model we assume that the outcome of
this transformation is a column vector of $n$ feature values for
each stimulus. That is, $f_i = T(x_i)$, where
$f_i = (f_{i1}, f_{i2}, \ldots, f_{in})$ with $n < m$. Once the feature
information is available, the listener compares the stimuli to
determine a similarity judgment, $C(f_i, f_j)$.

Figure 1. Preliminary three-stage model of the aural comparison process.

The heart of the feature selection problem involves
specifying the transformation $T$. Following Howard and Ballas
(1978), we have argued that this transformation reflects the
outcome of a structural analysis of the stimulus set, much like
a principal components analysis. This assumption specifies four
important properties of the transformation. First, the
transformation is linear. Since the features represent new
dimensions in the measurement space, the transformation must
project each stimulus in the measurement space onto the new
dimensions. The feature values, $f_i$, for stimulus $S_i$ are
therefore weighted linear combinations of the original
measurements, $x_i$. In matrix notation, each vector of feature
values is the product of a measurement vector and an $n$ by $m$
matrix of weights or coefficients, $T$: $f_i = T x_i$, or

$$
\begin{pmatrix} f_{i1} \\ f_{i2} \\ \vdots \\ f_{in} \end{pmatrix}
=
\begin{pmatrix}
t_{11} & t_{12} & \cdots & t_{1m} \\
t_{21} & t_{22} & \cdots & t_{2m} \\
\vdots & \vdots &        & \vdots \\
t_{n1} & t_{n2} & \cdots & t_{nm}
\end{pmatrix}
\begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{im} \end{pmatrix}
$$


In this view, the $j$th feature coordinate for stimulus $S_i$, $f_{ij}$,
is determined by the inner product of the $j$th row vector of $T$
and the measurement vector for that stimulus,
$f_{ij} = \sum_{k=1}^{m} t_{jk} x_{ik}$.

Second,  the  transformation  should  project  stimuli  from  the 
m-dimensional measurement space onto the n-dimensional feature
space  while  preserving  as  much  of  the  original  information  as 
possible.  This  is  achieved  by  selecting  transformation 
coefficients,  T,  such  that  the  variance  of  stimulus  projections 
onto  each  dimension  is  maximal. 

Third,  the  transformation  coefficient  vector  for  each 
feature  (i.e.,  each  row  in  the  T matrix)  should  be  of  unit 
length.  This  restriction  is  required  to  avoid  trivially 
satisfying  the  second  condition  by  selecting  arbitrarily  large 
coefficients.

Fourth,  the  transformation  coefficient  vectors  should  be 
mutually  orthogonal.  Since  the  primary  function  of  the  feature 
transformation  is  to  eliminate  redundancy  in  the  measurement 
representations,  it  is  obviously  desirable  that  the  features 
carry  as  little  overlapping  information  as  possible.  Together 
with  the  third  condition,  this  specifies  that  the  vectors  of 
projection  coefficients  be  orthonormal,  i.e.,  orthogonal  and  of 
unit  length. 
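The second and third conditions can be read together as a constrained maximization. The sketch below uses notation assumed here for illustration: $t$ for a single coefficient vector (one row of $T$), $K$ for the covariance matrix of the measurement vectors, and $\alpha$ for a Lagrange multiplier. It suggests why vectors satisfying these conditions turn out to be eigenvectors of $K$.

```latex
\begin{aligned}
\text{maximize}\quad & \operatorname{Var}\!\left(t^{\top} x\right) = t^{\top} K\, t
\qquad \text{subject to}\quad t^{\top} t = 1, \\
L(t, \alpha) &= t^{\top} K\, t - \alpha \left(t^{\top} t - 1\right), \\
\frac{\partial L}{\partial t} &= 2 K t - 2 \alpha t = 0
\;\Longrightarrow\; K t = \alpha t, \\
t^{\top} K\, t &= \alpha\, t^{\top} t = \alpha .
\end{aligned}
```

Read this way, each admissible coefficient vector is an eigenvector of $K$, and the variance of the stimulus projections onto it equals the associated multiplier. Because $K$ is symmetric, eigenvectors belonging to distinct multipliers are orthogonal, which is consistent with the fourth condition.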

Transformations  having  these  properties  are  frequently 
encountered  in  the  theoretical  pattern  recognition  literature, 
and  represent  a particular  instance  of  the  discrete 
Karhunen-Loève expansion (Meisel, 1972; Young & Calvert, 1974).
Fortunately,  the  desired  transformation  coefficients  are  readily 




obtained  by  decomposing  the  symmetric  m by  m covariance  matrix 
of  stimulus  measurements  (in  our  case  the  1/3-octave  band 
levels)  using  standard  techniques.  The  normalized  eigenvectors 
resulting  from  this  decomposition  provide  the  transformation 
coefficients,  and  the  corresponding  eigenvalues  indicate  the 
relative  importance  of  each  eigenvector.  An  optimal 
n-dimensional feature space may then be determined by selecting
the  n eigenvectors  having  the  largest  eigenvalues. 

More  specifically,  to  decompose  the  covariance  matrix  we 
need  to  solve  the  well-known  eigenvalue  problem 
$$K e_i = \alpha_i e_i, \qquad i = 1, 2, \ldots, m,$$

where $K$ represents the covariance matrix, $\{e_i\}$ represents a set
of $m$ orthogonal solution vectors, called eigenvectors, and
$\{\alpha_i\}$ are a set of $m$ associated scalars called eigenvalues
(Green & Carroll, 1976). In the present context, the
eigenvectors  indicate  the  new  dimensions  in  the  feature  space 
and  the  ith  eigenvalue  reflects  the  variability  of  stimulus 
projections  onto  the  ith  feature  dimension.  Although  m 
eigenvectors  exist  for  an  m by  m covariance  matrix,  a more 
efficient  stimulus  representation  can  be  obtained  by  discarding 
the  eigenvectors  that  account  for  relatively  little  of  the 
stimulus  variability.  To  the  extent  that  redundancy  exists  in 
the  original  measurements,  the  information  in  the  stimulus  can 
be  adequately  portrayed  with  fewer  dimensions  in  the  feature 
space  than  in  the  measurement  space  (i.e.,  n < m in  the  notation 
developed  above).  Once  we  have  selected  the  n eigenvectors, 
these  values  determine  the  coefficients  in  the  transformation 


matrix $T$, i.e.,

$$T = (e_1, e_2, \ldots, e_n)^{\top}.$$
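The steps above (form the covariance matrix, solve the eigenvalue problem, keep the n eigenvectors with the largest eigenvalues, and project) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions: the function name `feature_transform` is hypothetical, the measurement matrix is a random stand-in rather than the report's spectra, and the measurements are centered before projection, which only shifts the origin of the feature space.

```python
import numpy as np


def feature_transform(X, n):
    """Return (T, eigenvalues) for an (N stimuli x m bands) measurement matrix X.

    T is n x m; its rows are the n unit-length eigenvectors of the m x m
    covariance matrix that have the largest eigenvalues.
    """
    K = np.cov(X, rowvar=False)               # m x m covariance of band levels
    eigvals, eigvecs = np.linalg.eigh(K)      # symmetric input; ascending order
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    T = eigvecs[:, order[:n]].T               # keep n eigenvectors as rows
    return T, eigvals[order]


# Hypothetical measurement matrix: 16 stimuli x 22 band levels (random stand-in,
# not the spectra used in the report).
rng = np.random.default_rng(0)
X = rng.normal(size=(16, 22)) @ rng.normal(size=(22, 22))

T, eigvals = feature_transform(X, n=2)
F = (X - X.mean(axis=0)) @ T.T                # row i holds f_i = T x_i (centered)
```

The rows of `T` are orthonormal, and the variance of each column of `F` equals the corresponding eigenvalue. The sign of each eigenvector is arbitrary, so a feature axis may come out reflected without changing the structure of the space.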

Rationale

In  the  present  paper  we  will  examine  the  above  model  as  a 
characterization  of  the  feature  selection  process  for  human 
listeners  in  a timbre  comparison  task.  Since  it  is  well  known 
that  the  shape  of  the  amplitude  spectrum  is  the  primary  physical 
correlate  of  timbre  (Plomp,  1976),  any  model  that  describes  the 
feature  selection  process  must  account  for  its  effects  on  the 
psychological  feature  representation.  Because  we  are  primarily 
interested  in  the  transformations  involved  in  timbre  perception, 
sixteen  complex,  steady-state  sounds  that  differ  in  spectral 
shape  will  be  used  in  the  present  experiment.  The  sounds  were 
synthesized  by  combining  individual  sinusoidal  components  at 
1/3-octave  intervals.  These  intervals  were  selected  since  it  is 
generally  accepted  that  the  ear  resembles  a set  of  1/3-octave 
filters  in  its  frequency  resolving  power  (Plomp,  1976).  The 
amplitude  spectra  were  shaped  by  combining  the  components  at 
various amplitudes. All sixteen sounds had two spectral peaks
or formants of differing peak ratio and distinctiveness.

The  set  of  sixteen  complex  sounds  will  be  presented  to 
listeners  for  pairwise  similarity  judgments.  A measurement 
vector ($x_i$) will be obtained for each sound by
amplitude-weighting the 1/3-octave band levels using Stevens's
loudness function (Stevens, 1972). These loudness-adjusted
spectra  will  be  analyzed  according  to  the  procedures  outlined 
above  to  determine  an  optimal  structure-preserving  feature 
transformation.  The  feature  representation  predicted  by  this 
theoretical  analysis  will  be  compared  to  the  subjective  feature 
representation  observed  in  the  experiment  to  test  the  adequacy 
of  the  model. 

The  feature  representation  actually  used  by  the  listeners 
will  be  estimated  by  submitting  the  observed  similarity  matrices 
to  a nonmetric  multidimensional  scaling  analysis.  In 
particular,  the  ALSCAL  program  (Takane,  Young  & de  Leeuw,  1977) 
will be used to decompose the data into an n-dimensional metric
space  in  which  each  stimulus  is  represented  as  a single  point  or 
vector.  The  dimensions  revealed  in  this  analysis  will  be  taken 
to  reflect  those  features  that  the  listeners  employed  to  compare 
the  sounds. 

Method 

Subjects 

Six  undergraduate  student  volunteers  (5  males  and  1 female) 
were  paid  an  average  of  $3.00  per  hour  for  their  participation. 
All  students  had  some  musical  background;  however,  none  had 
taken  formal  training  in  the  last  three  years.  The  volunteers 
reported no history of hearing disorders.

Apparatus

All  experimental  events  were  controlled  by  a Digital 
Equipment Corporation PDP-8/e computer. Statistical analyses
were carried out on the Catholic University's DECsystem-10
computer using the IMSL statistical library, and the ALSCAL
multidimensional scaling program (Takane et al., 1977).

Listeners  were  isolated  in  a sound-attenuated  booth  during 
the  experiment.  A video  display  was  used  to  present  verbal 
feedback  and  instructions,  and  listeners  entered  their  responses 
on  a solid-state  keyboard.  A 12-bit  digital-to-analog  converter 
(Digital Equipment Corporation AA50) was used to output the
complex auditory waveforms at a sampling rate of 10 kHz.
Synthesized waveforms were low-pass filtered (Krohn-Hite Model
3550) with an upper cutoff frequency of 4 kHz to remove aliasing
frequencies. The sounds were passed through a programmable
attenuator (Texscan PA-50) before being presented over matched
headphones (Telephonics TDH-49, MX41/AR cushions).

Stimuli

Sixteen  complex  steady-state  sounds  were  constructed 
digitally  by  adding  together  22  individual  sinusoidal 
components.  As  indicated  above,  these  components  were  spaced  at 
1/3-octave intervals between 20 and 2500 Hz. Two parameters,
peak ratio and peak smear, were varied to produce amplitude
spectra of different shapes. The resulting spectra had maxima
at 500 and 1000 Hz with peak amplitude ratios of 1.00, .90, .80,
or .70 on a logarithmic scale. The amplitudes of the remaining
frequency components were determined by Gaussian distributions
centered at the two peak frequencies. Peak smear was
manipulated by varying the standard deviation of the
distributions (50, 100, 200, or 400 Hz). Thus, the spectra had
two distinct peaks with small standard deviations, but appeared
smeared  with  large  standard  deviations.  It  should  be  noted  that 
the  two  parameters  are  not  orthogonal  since  either  a low  peak 
ratio  or  a large  standard  deviation  would  produce  a more  uniform 
spectrum.  The  four  extreme  spectra  produced  by  the  combination 
of  these  two  dimensions  are  displayed  in  Figure  2,  and  the 
physical  parameters  for  each  stimulus  are  presented  in  Table  1. 
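The construction described above can be sketched in Python. This is only a rough illustration under several assumptions that the report does not confirm: the envelope is taken as the larger of two Gaussians centered at 500 and 1000 Hz, the peak ratio is applied as a linear multiplier (the report specifies it on a logarithmic scale), all components start at zero phase, and the names `relative_amplitudes` and `synthesize` are hypothetical. The component frequencies are the nominal 1/3-octave values listed in Table 2.

```python
import math

# Nominal 1/3-octave component frequencies (Hz), 20 to 2500 Hz (22 components).
FREQS = [20, 25, 31.5, 40, 50, 63, 80, 100, 125, 160, 200,
         250, 315, 400, 500, 630, 800, 1000, 1250, 1600, 2000, 2500]
SAMPLE_RATE = 10_000  # Hz, the D/A sampling rate given in the Apparatus section


def relative_amplitudes(peak_ratio, smear_hz):
    """Hypothetical envelope: the larger of two Gaussians centered at 500 and
    1000 Hz, with the 1000 Hz peak scaled by `peak_ratio` (an assumption)."""
    amps = []
    for f in FREQS:
        g500 = math.exp(-0.5 * ((f - 500.0) / smear_hz) ** 2)
        g1000 = peak_ratio * math.exp(-0.5 * ((f - 1000.0) / smear_hz) ** 2)
        amps.append(max(g500, g1000))
    return amps


def synthesize(peak_ratio, smear_hz, duration_s=0.1):
    """Sum the 22 sinusoids (zero starting phase, an assumption) into one waveform."""
    amps = relative_amplitudes(peak_ratio, smear_hz)
    n_samples = int(duration_s * SAMPLE_RATE)
    return [
        sum(a * math.sin(2.0 * math.pi * f * t / SAMPLE_RATE)
            for a, f in zip(amps, FREQS))
        for t in range(n_samples)
    ]


# Example: parameters of sound 13 in Table 1 (peak ratio .70, smear 50 Hz).
amps_13 = relative_amplitudes(0.70, 50.0)
wave_13 = synthesize(0.70, 50.0)
```

With a small standard deviation the envelope is essentially zero away from the two peaks, whereas a large standard deviation spreads energy into the neighboring bands, which is the smearing effect described in the text.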


Insert  Figure  2 and  Table  1 here 


The  stimuli  were  equated  subjectively  for  loudness  by  a 
preliminary  group  of  listeners  who  did  not  participate  in  the 
experiment.  The  loudness-equated  sounds  were  presented  at 
levels  of  between  76  and  78  dB  SPL. 

Procedure 

Participants  were  seated  in  the  sound-attenuated  booth  and 
were  given  typewritten  instructions.  After  the  listeners 
understood the instructions, the complete set of sixteen sounds
was presented four times in order to familiarize the listener
with the sounds they were to compare. The listener was
instructed  that  he  or  she  was  to  compare  the  stimuli,  and  assign 
a rating  of  "5"  if  the  two  sounds  were  very  similar  or  a rating 
of  "1"  if  the  sounds  were  very  dissimilar.  The  ratings  between 
1 and  5 were  to  be  used  for  pairs  of  intermediate  similarity. 
After  the  initial  familiarization  period,  the  listeners 
participated  in  the  comparison  task  for  three  days  in  one-hour 
sessions.  At  the  end  of  the  third  day,  a brief  sound-sorting 


Figure 2. Four extreme spectra (sounds 1, 4, 13, and 16) produced by
the combination of the peak ratio and peak smear parameters.
Horizontal axis: frequency (Hz).


Table 1

Physical Parameter Values Used to Generate Each of the Sixteen Test Sounds.
Peak Ratio Refers to the Amplitude in dB of the 1000 Hz Peak Relative
to the 500 Hz Peak. Peak Smear is Expressed in Standard Deviation
Units (Hz).

Sound    Peak Ratio    Peak Smear
  1         1.00           50
  2         1.00          100
  3         1.00          200
  4         1.00          400
  5          .90           50
  6          .90          100
  7          .90          200
  8          .90          400
  9          .80           50
 10          .80          100
 11          .80          200
 12          .80          400
 13          .70           50
 14          .70          100
 15          .70          200
 16          .70          400


task  was  given  in  which  the  listener  had  to  order  the  sounds 
from  lowest  to  highest  pitch  by  making  pairwise  judgments.  The 
participants  were  not  informed  of  the  pitch-sorting  task  until 
after  the  third  scaling  session.  The  pitch-sorting  task  was 
included  to  assess  the  possible  role  of  pitch  in  the  similarity 
data.

Each  trial  in  the  similarity  rating  task  began  when  the 
word  LISTEN  appeared  on  the  video  display.  After  a brief  delay, 
successive three-second samples of the comparison sounds were
presented with a one-second interstimulus interval. After the
second stimulus was presented, the words RATE SIMILARITY were
displayed. Listeners were allowed unlimited time to make their
response; however, most responded within four seconds. After
the listener responded, the display was cleared and the next
trial began. Each of the 120 possible stimulus pairs was
presented twice, counterbalanced for order of presentation.
This procedure was repeated on each of the three successive
days.
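The trial count follows directly from the design: sixteen sounds yield 16 x 15 / 2 = 120 unordered pairs, and presenting each pair in both orders gives 240 rating trials per day. A short Python sketch of this enumeration is given below; the variable names are illustrative, and no randomization of trial order is shown because the report does not describe one for the rating task.

```python
from itertools import combinations

N_SOUNDS = 16

# All unordered pairs of distinct sounds: C(16, 2) = 120.
pairs = list(combinations(range(1, N_SOUNDS + 1), 2))

# Each pair is presented in both orders (counterbalancing), giving 240 trials.
trials = [(a, b) for a, b in pairs] + [(b, a) for a, b in pairs]
```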


At  the  end  of  the  third  day,  listeners  participated  in  the 
pitch-sorting  task.  Before  beginning,  each  of  the  sixteen 
sounds  was  played  to  review  the  entire  set.  On  each  rating 
trial,  the  participant  saw  the  word  LISTEN  followed  by  a 
stimulus  pair,  and  then  the  words  WHICH  SOUND  WAS  LOWEST  IN 
PITCH  (I.E.,  MORE  BASS  SOUNDING)?,  were  displayed.  The  listener 
then  pressed  "1"  if  the  first  sound  was  lower  than  the  second, 
or  "2"  if  the  second  was  lower  than  the  first.  The  listener 


could repeat the trial by pressing a key marked "S". A
bubble-sort algorithm was employed to sort the sounds using the
pairwise  pitch  ratings.  After  the  sort  was  complete,  the 
listener  heard  all  of  the  sounds  in  the  pitch  ordering  that  he 
had  determined.  If  the  listener  was  not  satisfied  with  this 
ordering,  the  above  task  could  be  repeated.  However,  all 
listeners  required  only  one  pass  to  achieve  a satisfactory 
sorting.  Sound  pairs  were  presented  in  a different  random  order 
for  each  listener. 
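A bubble sort driven by pairwise judgments can be sketched as follows. This is an illustration rather than the program used in the experiment: the function name `bubble_sort_by_judgment` and the callback `first_is_lower`, which stands in for the listener's key press, are assumptions, and the simulated pitch values are invented for the example.

```python
def bubble_sort_by_judgment(sounds, first_is_lower):
    """Order `sounds` from lowest to highest pitch using pairwise judgments.

    `first_is_lower(a, b)` stands in for the listener's response: True if
    sound `a` was judged lower in pitch than sound `b` (the "1" key).
    """
    order = list(sounds)
    n = len(order)
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Present the adjacent pair; swap if the first was not judged lower.
            if not first_is_lower(order[i], order[i + 1]):
                order[i], order[i + 1] = order[i + 1], order[i]
                swapped = True
        if not swapped:      # a full pass with no swaps means the list is sorted
            break
    return order


# Simulated listener with a hypothetical pitch value per sound (illustration only).
pitch = {1: 3.0, 2: 1.5, 3: 2.2, 4: 0.7}
ranking = bubble_sort_by_judgment(pitch, lambda a, b: pitch[a] < pitch[b])
```

Because only adjacent items are compared, the number of pairwise judgments requested from the listener is at most n(n - 1)/2, which for sixteen sounds is 120.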


Results  and  Discussion 

Theoretical Analysis

A predicted  feature  representation  was  obtained  by  applying 
the model to the 22-element measurement vectors ($x_i$) for the
sixteen sounds. As indicated in the introduction, the predicted
features are simply principal component axes obtained in an
eigen-analysis of the measurement covariance matrix. The
variance accounted for by each axis or feature is given by the
corresponding eigenvalue. In the present case, the first two
principal components accounted for 91% of the overall stimulus
variability  (74%  and  17%  for  the  first  and  second  principal 
components,  respectively).  Since  the  third  principal  component 
accounted  for  less  than  6%  of  the  overall  variance,  it  was  not 
considered  further.  This  analysis  indicates  that  listeners  need 
only  use  two  comparison  features  to  account  for  most  of  the 
variability  in  the  present  stimuli. 
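The percentages quoted above are each eigenvalue's share of the sum of all eigenvalues. A brief Python sketch of that arithmetic follows; the eigenvalues are hypothetical placeholders chosen only so that the first two shares reproduce the reported 74% and 17%, and the function name `explained_proportions` is an assumption.

```python
def explained_proportions(eigenvalues):
    """Proportion of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical eigenvalues chosen so the first two shares match the 74% and 17%
# reported above; the remaining values are placeholders, not results.
eigenvalues = [74.0, 17.0, 5.0, 3.0, 1.0]
shares = explained_proportions(eigenvalues)
cumulative_two = shares[0] + shares[1]
```

Under these placeholder values the first two components together carry 91% of the variance, matching the figure in the text, and the third falls below the 6% mentioned as the reason for dropping it.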

The  normalized  transformation  coefficients  obtained  in  this 
analysis  are  displayed  in  Table  2.  As  indicated  in  the 
introduction, the feature projections for any stimulus are
obtained from the matrix equation, $f_i = T x_i$, where the
transformation matrix is given by the coefficients in Table 2.
The left half of Figure 3 displays a plot of the sixteen stimuli
in the predicted two-dimensional feature space.


Insert  Figure  3 and  Table  2 here 


It  is  obvious  from  an  examination  of  Figure  3 that  the 
first  principal  component  (Dimension  1)  is  related  to  the  peak 
smear  parameter  used  to  generate  the  stimuli.  Stimuli  having 
the least smear (1, 5, 9, 13) appear at the far right along this
dimension, whereas stimuli having the greatest smear (4, 8, 12,
16) appear on the extreme left. It is also evident, however,
that stimuli having the same peak smear do not have identical
Dimension 1 coordinates. For example, sounds 3, 7, 11, and 15
were  all  synthesized  with  a standard  deviation  of  200  Hz,  but 
have  differing  coordinates  along  this  dimension.  It  appears, 
then,  that  although  Dimension  1 is  determined  primarily  by  peak 
smear,  it  also  depends  on  the  peak  ratio  parameter. 

A more  complete  description  of  this  predicted  feature  may 
be obtained by examining the $t_1$ coefficient vector in Table 2.
Since  the  Dimension  1 projection  for  any  stimulus  is  simply  a 
weighted  linear  combination  of  its  22  band-level  measurements, 
the  coefficients  indicate  the  relative  importance  of  each 
individual  band  level.  For  Dimension  1,  the  two  frequency  bands 
lying  between  the  500  and  1000  Hz  peaks,  630  and  800  Hz,  have 
the  largest  coefficients.  This  is  generally  consistent  with  our 


Table 2

Normalized Transformation Coefficients (x 105) for Each Frequency Component
Obtained in a Theoretical Analysis of the Sixteen Sounds. The Two
Coefficient Vectors, t_1 and t_2, Form the Predicted Transform Matrix T.(a)

Component    Frequency (Hz)     t_1      t_2
    1             20            -38       -1
    2             25            -39       -1
    3             31.5          -40       -1
    4             40            -41        0
    5             50            -42        0
    6             63            -44        1
    7             80            -46        2
    8            100            -55        5
    9            125            -75       10
   10            160            -98       21
   11            200           -129       39
   12            250           -144       63
   13            315           -174      114
   14            400           -206      [illegible]
   15            500             --      [illegible]
   16            630           -236      [illegible]
   17            800           -215      [illegible]
   18           1000            -66      [illegible]
   19           1250           -136      [illegible]
   20           1600            -44      [illegible]
   21           2000             -8      [illegible]
   22           2500              0        0

(a) Component amplitudes were equated at the 500 Hz peak; hence,
zero coefficients were




observation  that  Dimension  1 is  most  closely  related  to  the  peak 
smear  parameter.  However,  the  amplitude  of  components  lying 
between  the  two  peaks  will  be  determined  by  both  peak  ratio  and 
peak  smear.  These  values  may  be  increased  by  either  (1) 
increasing  the  smear,  as  in  moving  from  sound  1 to  sound  4,  or 
(2)  increasing  the  peak  ratio,  as  in  moving  from  sound  15  to 
sound  3.  In  summary,  the  model  predicts  that  listeners  will 
focus  on  the  intensity  of  components  between  the  two  peaks  as  a 
primary  comparison  feature.  It  is  interesting  to  note  in  this 
context  that  a similar  "peak  distinctiveness"  perceptual  feature 
was described by Howard (1977) in an earlier psychophysical
investigation  of  complex  sounds. 

When  the  second  dimension  is  considered,  a similar  picture 
emerges.  Examination  of  Figure  3 clearly  indicates  that  the 
stimulus  coordinates  along  this  dimension  are  determined  by  an 
interaction  of  the  peak  ratio  and  peak  smear  parameters. 
Although  the  peak  ratio  rank  ordering  is  maintained  within 
groups  of  four  stimuli,  the  absolute  Dimension  2 coordinate 
depends  on  peak  smear  as  well.  As  with  Dimension  1,  a clearer 
understanding  of  this  feature  may  be  obtained  by  examining  the 
$t_2$ coefficient vector in Table 2. It is interesting that
frequencies below 400 Hz contribute very little to this feature
value. In contrast, the bands adjacent to the 500 Hz peak, 400
and 630 Hz, have large positive coefficients, and the 1000 and
1250 Hz bands have very large negative weights. When these
coefficients  are  applied  to  transform  the  1/3-octave  spectra  for 
our  sounds,  a "relative  pitch"  dimension  emerges.  In 


Feature  Selection 


Page  21 


particular,  sounds  having  relatively  greater  high  frequency 
energy  (i.e.,  1000  and  1250  Hz  region)  will  produce  large 
negative  coordinates  for  this  feature.  For  example,  since  sound 
5 has  a lower  peak  ratio  than  sound  1,  it  is  clear  that  it  will 
have  relatively  less  energy  in  the  high  frequency  region.  When 
sound  2 is  compared  to  sound  1,  however,  we  must  consider  the 
role  of  peak  smear.  In  this  case,  the  100  Hz  standard  deviation 
used  to  smear  the  peaks  in  sound  2 effectively  increases  the  low 
frequency  energy  relative  to  the  high  frequency  energy.  This 
occurs  because  of  the  wider  1/3-octave  intervals  in  the  high 
frequency  region.  In  contrast,  when  sound  4 is  compared  to 
sound  2,  the  broader  peak  smearing  used  for  sound  4 (400  Hz 
standard  deviation)  also  increases  the  amplitude  of  the  more 
heavily weighted 1250 Hz component. The net result is that the
Dimension  2 coordinate  for  sound  4 is  somewhat  more  negative 
than  that  for  sound  2.  Subjectively,  we  can  say  that  listeners 
are  expected  to  compare  the  overall  pitch  of  the  stimuli  within 
the  four-stimulus  clusters  along  Dimension  1.  The  expected 
interaction  of  peak  smear  and  peak  ratio  is  clearly  evident  in 
the  inverted  "U"  distribution  of  stimuli  in  the  predicted 
feature  space. 
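
The way a coefficient vector turns a 1/3-octave spectrum into a single feature coordinate can be sketched as a weighted sum of band levels. This is a minimal illustration only: the band list, the weights, and the two spectra below are hypothetical values chosen to mimic the sign pattern described above (positive weights next to the 500 Hz peak, negative weights at 1000 and 1250 Hz). They are not the Table 2 coefficients.

```python
# Hypothetical 1/3-octave band centers (Hz) and illustrative weights.
# The weights only mimic the sign pattern described in the text
# (positive next to the 500 Hz peak, negative at 1000 and 1250 Hz);
# they are NOT the Table 2 coefficients.
BANDS_HZ = [400, 500, 630, 800, 1000, 1250]
WEIGHTS = [0.4, 0.0, 0.5, -0.1, -0.6, -0.7]


def feature_value(band_levels, weights=WEIGHTS):
    """Project a band-level spectrum onto one feature axis (a dot product)."""
    if len(band_levels) != len(weights):
        raise ValueError("spectrum and weight vector must have equal length")
    return sum(w * s for w, s in zip(weights, band_levels))


# Two made-up spectra in arbitrary level units, equated at the 500 Hz band.
low_dominant = [8.0, 10.0, 7.0, 3.0, 2.0, 1.0]
high_dominant = [4.0, 10.0, 4.0, 5.0, 8.0, 6.0]

coord_low = feature_value(low_dominant)
coord_high = feature_value(high_dominant)
print(coord_low, coord_high)
```

Under these assumed weights, the spectrum with more energy near 1000 and 1250 Hz receives the more negative coordinate, which is the direction of the effect described for Dimension 2.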

To summarize, the feature selection model predicts that
listeners will need only two comparison features to adequately
perform the similarity judgment task for these stimuli. More
specifically, we expect the intensity of inter-peak components
or peak distinctiveness to be particularly important (Dimension
1). The second, but less important feature should reflect the
relative amount of high versus low frequency energy in the
sounds.

Perceptual Analysis

A full  16  by  16  matrix  of  similarity  judgments  was  obtained 
from  each  listener  on  each  of  the  three  sessions.  The  first 
session  was  viewed  as  practice,  and  these  data  were  not 
considered  further.  Data  were  summed  across  the  remaining  two 
sessions  to  yield  a single  proximity  matrix  for  each  listener. 
These  data  were  checked  for  consistency  by  computing  a Pearson 
product-moment  correlation  between  the  upper  and  lower  halves  of 
each matrix. The average correlation was .70, with five of the
six listeners showing a correlation of .64 or better. This was
taken to indicate that the listeners' similarity judgments were
sufficiently  stable  to  justify  further  analysis. 
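
The consistency check described above can be sketched as follows, assuming that each unordered pair of sounds appears twice in the full matrix, once in cell (i, j) and once in cell (j, i). The function names and the small 4 x 4 ratings matrix are hypothetical and serve only to show the computation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def half_matrix_consistency(matrix):
    """Correlate each upper-triangle cell (i, j) with its mirror cell (j, i)."""
    n = len(matrix)
    upper = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    lower = [matrix[j][i] for i in range(n) for j in range(i + 1, n)]
    return pearson_r(upper, lower)


# Hypothetical summed ratings for four sounds (diagonal unused).
ratings = [
    [0, 2, 5, 9],
    [3, 0, 4, 8],
    [5, 5, 0, 3],
    [8, 9, 2, 0],
]
consistency = half_matrix_consistency(ratings)
print(round(consistency, 2))
```

A perfectly symmetric matrix would give a correlation of 1, so lower values reflect disagreement between the two presentations of the same pair.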

The  summed  matrix  for  each  individual  was  submitted  to  a 
nonmetric  individual  differences  ALSCAL  analysis.  The  selected 
nonmetric  scaling  model  required  that  we  only  assume  ordinal 
level  measurement  in  the  initial  subjective  proximity  matrices. 
In  addition,  the  individual  differences  model  provides  a 
saliency  vector  for  each  listener  that  indicates  the  relative 
importance  of  each  dimension  for  that  person.  The  latter 
property  will  enable  us  to  assess  individual  listener 
consistency.
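
The role of the saliency vector can be illustrated with the weighted Euclidean distance commonly used in individual differences scaling. That this is the distance function is an assumption of this sketch, since the report does not spell it out, and the coordinates and weights below are hypothetical.

```python
from math import sqrt


def weighted_distance(x_i, x_j, saliency):
    """Weighted Euclidean distance between two stimuli for one listener."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(saliency, x_i, x_j)))


# Hypothetical group-space coordinates (Dimension 1, Dimension 2).
stim_a = (1.0, 0.5)
stim_b = (-1.0, 0.5)

# Hypothetical saliency vectors: listener 1 stresses Dimension 1,
# listener 2 stresses Dimension 2.
d_listener_1 = weighted_distance(stim_a, stim_b, (0.8, 0.2))
d_listener_2 = weighted_distance(stim_a, stim_b, (0.2, 0.8))
print(d_listener_1, d_listener_2)
```

Because the two stimuli differ only on Dimension 1, the listener who weights that dimension more heavily is modeled as perceiving them as farther apart.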

The  two-dimensional  ALSCAL  solution  provided  an  adequate 
representation  of  the  subjective  similarity  data.  Although  the 
observed  stress  (18.6%)  was  only  in  the  "fair"  range  according 
to Kruskal (1964), the addition of a third dimension resulted in
little improvement in either stress (5%) or interpretability. In
addition, Young (1970) has pointed out that for proximity data
containing  any  sampling  error,  stress  tends  to  increase  with  the 
number  of  data  points.  This  occurs  despite  the  fact  that  the 
scaling  solution  may  actually  recover  most  of  the  underlying 
metric  information.  The  stimulus  space  obtained  in  our  scaling 
analysis  is  displayed  in  the  right  half  of  Figure  3.  These  data 
will  be  discussed  in  terms  of  the  theoretical  predictions 
outlined  above. 
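
For readers unfamiliar with the stress index, a minimal sketch of Kruskal's stress formula 1 is given below. It assumes that the reported value is of this form; ALSCAL itself optimizes a related index defined on squared distances, so the figure printed by the program may have been computed somewhat differently. The function name and test values are illustrative.

```python
from math import sqrt


def kruskal_stress(distances, disparities):
    """Kruskal's stress formula 1.

    distances:   inter-point distances in the scaling solution
    disparities: monotone-regression fitted values for the same pairs
    """
    if len(distances) != len(disparities):
        raise ValueError("distances and disparities must have equal length")
    residual = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    scale = sum(d ** 2 for d in distances)
    return sqrt(residual / scale)


# Tiny hypothetical example: one of two distances departs from its disparity.
example = kruskal_stress([3.0, 4.0], [3.0, 3.0])
print(example)
```

Stress is zero when the configuration distances reproduce the fitted disparities exactly, and grows as the rank order of the data is matched less well.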

The  finding  that  a two-dimensional  scaling  solution  was 
adequate  suggests  that  our  listeners  employed  two  comparison 
features.  This  was  predicted  in  our  theoretical  analysis.  It 
was  further  predicted  that  the  two  dimensions  would  differ 
widely  in  their  relative  importance.  Theoretically,  Dimension  1 
accounted for 74% of the stimulus variability and Dimension 2
accounted for only 17% of the variability. A similar result was
observed  in  our  scaling  analysis  of  the  perceptual  data.  All 
six  listeners  placed  relatively  greater  emphasis  on  one 
dimension  (Dimension  1 in  Figure  3)  than  on  the  other. 

An  initial  visual  comparison  of  the  theoretical  and 
observed  feature  spaces  in  Figure  3 reveals  both  similarities 
and  differences.  First,  it  is  apparent  that  the  overall 
configuration  of  stimuli  is  similar  in  the  two  spaces.  In  both 
cases,  the  stimuli  have  an  inverted  "U"  distribution,  albeit 
more  pronounced  in  the  subjective  space,  and  the  stimulus 
projections  onto  the  two  axes  are  generally  comparable.  The 
Pearson product-moment correlations between the corresponding
coordinates in the two spaces are consistent with this
observation (r = .94, t(14) = 9.91, p < .01, and r = .82,
t(14) = 5.46, p < .01 for Dimensions 1 and 2, respectively).
The  actual  theoretical  and  observed  coordinates  for  both 
dimensions  are  presented  in  Table  3. 


Insert  Table  3 here 


We  may  conclude,  therefore,  that  the  feature  selection  model 
outlined  above  successfully  predicted  the  signal  attributes  that 
listeners  would  use  to  compare  the  sounds. 
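
The correlations and t values quoted above can be recomputed from the Table 3 coordinates. The sketch below assumes the coordinates as transcribed from the scanned table; the observed Dimension 1 entry for sound 4 is smudged in the scan and is taken to be -.99. The t statistic uses the usual conversion t = r * sqrt(n - 2) / sqrt(1 - r^2) with n = 16 sounds.

```python
from math import sqrt

# (sound, predicted D1, predicted D2, observed D1, observed D2) from Table 3.
# Assumption: the smudged observed D1 entry for sound 4 is read as -0.99.
TABLE_3 = [
    (1, 1.11, -1.87, 1.24, -1.46),
    (2, 0.45, -1.42, 1.02, -0.78),
    (3, -0.74, -1.37, -0.63, -1.01),
    (4, -1.70, -1.59, -0.99, -1.36),
    (5, 1.20, -0.45, 1.28, -0.91),
    (6, 0.60, 0.03, 0.70, 0.83),
    (7, -0.39, 0.32, -0.87, 0.29),
    (8, -1.34, 0.30, -1.13, -0.99),
    (9, 1.25, 0.20, 1.34, 0.10),
    (10, 0.66, 0.70, 0.42, 1.40),
    (11, -0.30, 1.09, -0.92, 0.88),
    (12, -1.25, 0.53, -1.09, -0.22),
    (13, 1.27, 0.51, 1.25, 0.53),
    (14, 0.70, 1.02, 0.41, 1.70),
    (15, -0.26, 1.44, -0.85, 1.32),
    (16, -1.24, 0.55, -1.18, -0.32),
]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


n_sounds = len(TABLE_3)
r_dim1 = pearson_r([row[1] for row in TABLE_3], [row[3] for row in TABLE_3])
r_dim2 = pearson_r([row[2] for row in TABLE_3], [row[4] for row in TABLE_3])
t_dim1 = t_for_r(r_dim1, n_sounds)
t_dim2 = t_for_r(r_dim2, n_sounds)
print(round(r_dim1, 2), round(t_dim1, 2), round(r_dim2, 2), round(t_dim2, 2))
```

With the coordinates as transcribed here, both correlations and their t values should land close to the reported figures; small differences would be expected because the table entries are rounded to two decimals.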

However,  despite  this  overall  consistency,  a number  of 
important  differences  exist  that  deserve  further  comment.  With 
regard  to  Dimension  1,  it  is  obvious  in  Figure  3 that  listeners 
did  not  clearly  distinguish  the  four  stimulus  clusters  predicted 
by  the  model.  Rather,  the  listeners  tended  to  dichotomize 
stimuli along this dimension, maintaining a large perceptual
difference between the low peak smear stimuli (50 and 100 Hz
standard deviation) and the high peak smear stimuli (200 and 400
Hz standard deviation). It is interesting to note, however,
that the predicted between-cluster rank orderings are observed
in all cases. Mean Dimension 1 projections of 1.28, .64, -.82,
and -1.09 were observed for the four expected clusters
(1-5-9-13, 2-6-10-14, 3-7-11-15, and 4-8-12-16, respectively),
and in no instance did the clusters overlap along this
dimension. It appears, then, that our listeners made somewhat
cruder stimulus distinctions in the peak distinctiveness feature
than were predicted by the model.

Table 3

Predicted and Obtained Psychological Coordinates
for Each of the Sixteen Complex Sounds

         Predicted Dimensions    Observed Dimensions
Sound        1         2             1         2
  1        1.11     -1.87          1.24     -1.46
  2         .45     -1.42          1.02      -.78
  3        -.74     -1.37          -.63     -1.01
  4       -1.70     -1.59          -.99     -1.36
  5        1.20      -.45          1.28      -.91
  6         .60       .03           .70       .83
  7        -.39       .32          -.87       .29
  8       -1.34       .30         -1.13      -.99
  9        1.25       .20          1.34       .10
 10         .66       .70           .42      1.40
 11        -.30      1.09          -.92       .88
 12       -1.25       .53         -1.09      -.22
 13        1.27       .51          1.25       .53
 14         .70      1.02           .41      1.70
 15        -.26      1.44          -.85      1.32
 16       -1.24       .55         -1.18      -.32
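
The cluster means and the absence of overlap reported for Dimension 1 can be checked against the observed coordinates in Table 3. The sketch below assumes the transcribed values, with the smudged entry for sound 4 read as -.99; the variable names are illustrative.

```python
# Observed Dimension 1 coordinates from Table 3, keyed by sound number.
# Assumption: the smudged entry for sound 4 is read as -0.99.
OBSERVED_D1 = {
    1: 1.24, 2: 1.02, 3: -0.63, 4: -0.99,
    5: 1.28, 6: 0.70, 7: -0.87, 8: -1.13,
    9: 1.34, 10: 0.42, 11: -0.92, 12: -1.09,
    13: 1.25, 14: 0.41, 15: -0.85, 16: -1.18,
}

# The four peak smear clusters named in the text, from least to most smear.
CLUSTERS = [(1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15), (4, 8, 12, 16)]

cluster_means = [
    sum(OBSERVED_D1[s] for s in cluster) / len(cluster) for cluster in CLUSTERS
]

# Clusters do not overlap if every member of one cluster lies above
# every member of the next cluster along the dimension.
non_overlapping = all(
    min(OBSERVED_D1[s] for s in CLUSTERS[k])
    > max(OBSERVED_D1[s] for s in CLUSTERS[k + 1])
    for k in range(len(CLUSTERS) - 1)
)
print([round(m, 2) for m in cluster_means], non_overlapping)
```

The gap between the second and third cluster means is much larger than the gaps within each pair, which is the dichotomy between low and high peak smear stimuli described above.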

Another  discrepancy  of  interest  was  observed  in  Dimension 
2.  Here,  the  interaction  of  the  peak  ratio  and  peak  smear 
parameters  was  stronger  than  expected  theoretically.  In 
particular,  the  low  frequency  dominant  sounds  (10,  14,  11,  and 
15)  were  more  clearly  distinguished  from  the  high  frequency 
dominant  sounds  than  expected.  This  difference  would  occur  if 
the  listeners  gave  the  lower  spectral  region  (i.e.,  adjacent  to 
the 500 Hz peak) greater weight than indicated by the predicted
transformation  coefficients  in  Table  2.  This  result  would  be 
expected  if  perceptual  masking  effects  are  considered. 

To  obtain  additional  information  on  the  subjective 
properties  of  this  feature,  the  results  of  the  pitch  ranking 
task  were  examined.  In  this  task  listeners  performed  a pairwise 
pitch  sorting  of  the  sounds.  If  Dimension  2 in  the  scaling 
solution reflects overall stimulus pitch, then the pitch ranking
obtained in the sorting task should correspond to the Dimension
2 coordinate ranking. Since the six listeners produced
generally consistent rank orderings (coefficient of concordance
W = .87, χ²(15) = 77.94, p < .001), a rank of summed ranks was
determined for each stimulus. Of interest here was the finding
that all eight low peak smear stimuli (i.e., generated with 50
or 100 Hz standard deviations) were ranked lower in pitch than
the eight high smear stimuli. This is clearly inconsistent with
the observed Dimension 2 projections. It is important to note,
however, that only the high smear sounds had any significant
energy at frequencies beyond 1000 Hz because of the wider
component spacing in this region. Our listeners were sensitive
to  this  high  frequency  energy,  and  assigned  these  sounds 
appropriately  higher  pitch  rankings.  When  the  low  and  high 
smear  sounds  are  considered  separately,  the  within  set  orderings 
correspond  reasonably  well  to  the  Dimension  2 projection  ranks 
in  the  perceptual  feature  space  as  may  be  seen  in  Table  4. 
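
The coefficient of concordance used to check agreement among the six listeners can be sketched as follows. The code assumes untied ranks and the standard chi-square approximation, chi-square = m(n - 1)W with n - 1 degrees of freedom, for m judges and n items. The toy rankings are hypothetical, and W = .866 is a value back-computed from the reported chi-square rather than a figure given in the report.

```python
def kendalls_w(rank_rows):
    """Kendall's coefficient of concordance for untied ranks.

    rank_rows[j][i] is the rank that judge j assigned to item i.
    """
    m = len(rank_rows)
    n = len(rank_rows[0])
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2.0
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


def concordance_chi_square(w, m, n):
    """Chi-square approximation for W, with n - 1 degrees of freedom."""
    return m * (n - 1) * w


# Hypothetical rankings of four items by three judges.
toy_ranks = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
w_toy = kendalls_w(toy_ranks)

# Six listeners, sixteen sounds; W = .866 is back-computed (an assumption).
chi_sq = concordance_chi_square(0.866, 6, 16)
print(round(w_toy, 3), round(chi_sq, 2))
```

With six listeners and sixteen sounds the multiplier m(n - 1) is 90, so the reported chi-square of 77.94 on 15 degrees of freedom is consistent with the rounded W of .87.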


Insert  Table  4 here 

This observation was confirmed by significant Spearman
correlations for both sound clusters (r = .98, p < .01 and
r = .71, p < .05 for the low and high smear sounds,
respectively). This finding is consistent with our theoretical
analysis in indicating that Dimension 2 reflects pitch in a
relative rather than absolute sense.
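
The two Spearman correlations can be recomputed from the ranks in Table 4. The sketch below uses the simple difference formula, 1 - 6 * sum(d^2) / (n(n^2 - 1)), without a correction for the tied pitch ranks, which is an assumption about how the coefficients were obtained. The final table row (sounds 14 and 16) is not legible in the scanned copy; the ranks used for it here are the ones implied by the remaining ranks and by the Dimension 2 coordinates in Table 3, and should be read as inferred values.

```python
# (pitch rank, scaling rank) pairs from Table 4, keyed by sound number.
# Assumption: the entries for sounds 14 and 16 are inferred, because the
# last table row is missing from the scanned copy.
LOW_SMEAR = {
    1: (1, 1), 2: (4, 3), 5: (2, 2), 6: (6, 6),
    9: (3, 4), 10: (7, 7), 13: (5, 5), 14: (8, 8),
}
HIGH_SMEAR = {
    3: (2, 2), 4: (1, 1), 7: (3.5, 6), 8: (3.5, 3),
    11: (5, 7), 12: (6, 5), 15: (7.5, 8), 16: (7.5, 4),
}


def spearman_rho(rank_pairs):
    """Spearman rank correlation via the rank-difference formula (no tie correction)."""
    n = len(rank_pairs)
    sum_d2 = sum((a - b) ** 2 for a, b in rank_pairs)
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))


rho_low = spearman_rho(list(LOW_SMEAR.values()))
rho_high = spearman_rho(list(HIGH_SMEAR.values()))
print(round(rho_low, 2), round(rho_high, 2))
```

Under these assumptions the two coefficients round to the values reported above for the low and high smear sounds.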

Summary and Conclusions

The  perceptual  data  considered  above  were  generally 
consistent  with  the  predictions  of  our  feature  selection  model. 
The  model  successfully  predicted  the  two  comparison  features 
that  the  listeners  used  to  generate  their  pairwise  similarity 
ratings.  In  addition,  it  was  able  to  predict  the  relative 
importance  of  these  two  dimensions.  This  suggests  that  our 
theoretical  assumptions  about  the  listener's  feature  selection 
criteria  were  reasonable.  The  model  proposes  that  listeners 
perform a structural analysis of the variability in the stimulus
set,  and  select  features  that  enable  them  to  retain  as  much  of 
this variability as possible while eliminating redundancy.
Simply put, the listeners were doing exactly what we expected
them to do--that is, compare the sounds--in a statistically
efficient manner.

Table 4

Rank Order of Low Smear and High Smear Sounds Observed in the Pitch Ranking
Task and Multidimensional Scaling Solution (Dimension 2).

       Low Smear Sounds                  High Smear Sounds
Sound   Pitch Rank   Scaling      Sound   Pitch Rank   Scaling
  1         1           1            3         2           2
  2         4           3            4         1           1
  5         2           2            7         3.5         6
  6         6           6            8         3.5         3
  9         3           4           11         5           7
 10         7           7           12         6           5
 13         5           5           15         7.5         8
 14         8           8           16         7.5         4

This  interpretation  is  consistent  with  the  process-oriented 
approach to auditory feature selection (Howard & Ballas, 1978).
It asserts that the most reasonable questions to ask about
timbre perception should address the feature selection process
that listeners use rather than the feature detectors that they
use. Indeed, it is entirely possible that specific timbre
features do not exist in any absolute sense. The invariant and
predictable aspect of timbre perception may well involve a set
of rules and criteria that specify a flexible feature selection
process. It has been our objective here to investigate this
possibility.

Although  the  present  model  enjoyed  some  success  in 
predicting  the  general  characteristics  of  the  perceptual  feature 
space,  a number  of  difficulties  exist.  In  particular,  we  noted 
that  the  fine  structure  or  distribution  of  stimuli  within 
dimensions  was  not  well  handled  by  the  model.  This  short-coming 
will  hopefully  be  eliminated  as  we  are  able  to  develop  a more 
precise  specification  of  the  proposed  transformations.  At 
present,  for  example,  we  summarize  the  contribution  of  the 
auditory periphery by a loudness-weighted 1/3-octave spectral
analysis. Although reasonable as a first approximation, masking
and other known peripheral effects must be considered.
Similarly, we must clarify the role of attentional bias in the
feature selection process, specify an appropriate measurement
space for transient or time-varying signals, and indicate how
the proposed structural analysis takes place on a trial-by-trial
basis. These and other questions clearly call for additional
research.

Finally,  it  is  important  to  recognize  that  the  present 
research  has  a number  of  important  practical  implications  beyond 
the  theoretical  issues  discussed  above.  Once  specified,  a 
feature selection model will enable us to predict a priori the
features or sources of variation that listeners will use to
evaluate complex aural signals. Once the feature structure is
known, the confusability of specific stimuli can be anticipated.
In a context where this information is important, e.g., in the
classification of aural sonar signatures, preprocessors or other
performance aids may be introduced to reduce item confusability
as required.

Acknowledgments

This  research  was  supported  by  a contract  from  the 
Engineering  Psychology  Programs,  Office  of  Naval  Research  to  The 
Catholic  University  of  America.  James  H.  Howard,  Jr.  was  the 
principal  investigator.  The  authors  thank  Darlene  V.  Howard 
for  her  helpful  comments  on  an  earlier  draft  of  this  manuscript 
and acknowledge the contribution of James A. Ballas and James
A.  Galgano  to  this  research. 




References 

Barlow, H. B. Single units and sensation: A neuron doctrine for
perceptual psychology? Perception, 1972, 1, 371-394.

Green, P. E., & Carroll, J. D. Mathematical tools for applied
multivariate analysis. New York: Academic Press, 1976.

Grey, J. M., & Gordon, J. W. Perceptual effects of spectral
modifications on musical timbres. Journal of the Acoustical
Society of America, 1978, 63, 1493-1500.

Harris, R. J. A primer of multivariate statistics. New York:
Academic Press, 1975.

Howard, J. H., Jr. Psychophysical structure of eight complex
underwater sounds. Journal of the Acoustical Society of
America, 1977, 62, 149-156.

Howard, J. H., Jr., & Ballas, J. A. Feature selection in
auditory perception (Technical Report ONR-78-5). Washington,
D.C.: The Catholic University of America Human Performance
Laboratory, July, 1978.

Kruskal, J. B. Multidimensional scaling by optimizing goodness
of fit to a non-metric hypothesis. Psychometrika, 1964, 29,
1-27.

Meisel, W. S. Computer-oriented approaches to pattern
recognition. New York: Academic Press, 1972.

Miller, J. R., & Carterette, E. C. Perceptual space for musical
structures. Journal of the Acoustical Society of America,
1975, 58, 711-720.

Plomp, R. Aspects of tone sensation: A psychophysical study.
New York: Academic Press, 1976.

Shepard, R. N. Psychological representation of speech sounds.
In E. E. David & P. B. Denes (Eds.), Human communication: A
unified view. New York: McGraw-Hill, 1972.

Stevens, S. S. Perceived level of noise by Mark VII and decibels
(E). Journal of the Acoustical Society of America, 1972, 51,
575-601.

Takane, Y., Young, F. W., & de Leeuw, J. Non-metric individual
differences multidimensional scaling: An alternating least
squares method with optimal scaling features. Psychometrika,
1977, 42, 7-67.

von Bismarck, G. Timbre of steady sounds: A factorial
investigation of its verbal attributes. Acustica, 1974, 30,
146-159.

Young, F. W. Nonmetric multidimensional scaling: Recovery of
metric information. Psychometrika, 1970, 35, 455-473.

Young, T. Y., & Calvert, T. W. Classification, estimation, and
pattern recognition. New York: American Elsevier, 1974.

Zwicker, E., Flottorp, G., & Stevens, S. S. Critical band width
in loudness summation. Journal of the Acoustical Society of
America, 1957, 29, 548-557.




